Orthogonal Nonnegative Matrix Tri-factorization for Semi-supervised Document Co-clustering

نویسندگان

  • Huifang Ma
  • Weizhong Zhao
  • Qing Tan
  • Zhongzhi Shi
چکیده

Semi-supervised clustering is often viewed as using labeled data to aid the clustering process. However, existing algorithms fail to consider dual constraints between data points (e.g. documents) and features (e.g. words). To address this problem, in this paper, we propose a novel semi-supervised document co-clustering model OSS-NMF via orthogonal nonnegative matrix tri-factorization. Our model incorporates prior knowledge both on document and word side to aid the new wordcategory and document-cluster matrices construction. Besides, we prove the correctness and convergence of our model to demonstrate its mathematical rigorous. Our experimental evaluations show that the proposed document clustering model presents remarkable performance improvements with certain constraints.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Projected Alternating Least square Approach for Computation of Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) is a common method in data mining that have been used in different applications as a dimension reduction, classification or clustering method. Methods in alternating least square (ALS) approach usually used to solve this non-convex minimization problem.  At each step of ALS algorithms two convex least square problems should be solved, which causes high com...

متن کامل

Orthogonal Nonnegative Matrix Factorization for Multi-type Relational Clustering

Relational clustering with heterogeneous data objects has impact in various important applications, such as web mining, text mining and bioinformatics etc. In this paper, we build a star-structured general model for relational clustering. It is formulated as an orthogonal tri-nonnegative matrix factorization. The model performs matrix approximation among all different data types to look for hid...

متن کامل

Nonnegative Matrix Factorization with Orthogonality Constraints

Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is to decompose a data matrix into a product of two factor matrices with all entries in factor matrices restricted to be nonnegative. NMF was shown to be useful in a task of clustering (especially document clustering), but in some cases NMF produces the results inappropria...

متن کامل

Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds

Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is decompose a data matrix into a product of two factor matrices with all entries in factor matrices restricted to be nonnegative. NMF was shown to be useful in a task of clustering (especially document clustering). In this paper we present an algorithm for orthogonal nonn...

متن کامل

Sentiment Classification with Graph Co-Regularization

Sentiment classification aims to automatically predict sentiment polarity (e.g., positive or negative) of user-generated sentiment data (e.g., reviews, blogs). To obtain sentiment classification with high accuracy, supervised techniques require a large amount of manually labeled data. The labeling work can be time-consuming and expensive, which makes unsupervised (or semisupervised) sentiment a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010